Abstract

In the case of Zellner's seemingly unrelated statistical model it
is well known that the efficiency of the generalized least squares
estimator (GLSE) relative to that of the least squares estimator (LSE)
is conditional on the magnitude of the correlation between the
equation errors. Using a relevant test statistic, we analytically
evaluate the risk characteristics of a seemingly unrelated regressions
pre-test estimator (SURPE) that is the GLSE if a preliminary test,
based on the data at hand, indicates that the correlation between
equation errors is significantly different from zero, and the LSE if
we accept the null hypothesis of no correlation. The small sample
distribution of the test statistic used in defining the SURPE is also
derived.

Key Words: Risk, Pre-test estimator, Least squares estimator,
Generalized least squares estimator, Seemingly unrelated
regression model, Test statistic.


THE  RISK  PROPERTIES  OF  A  PRE-TEST  ESTIMATOR 
FOR  ZELLNER'S  SEEMINGLY  UNRELATED  REGRESSION  MODEL 


1.   Introduction 

Since Zellner (1962) proposed the use of Aitken's generalized
least squares estimator (GLSE) for a set of disturbance related
regression equations, the efficiency of this estimator relative to
that of the least squares estimator (LSE) has received much attention.
For the uncorrelated regressors case, Zellner (1963) derived the small
sample properties of the seemingly unrelated regression estimator
(SURE) and noted that the distribution of the estimator converges
rapidly toward a normal density. Mehta and Swamy (1976) derived the
exact second moment matrix of Zellner's estimator conditional on an
estimate of the variance-covariance matrix of the error terms and
found that (i) the LSE is more efficient than Zellner's estimator if
the correlation in the errors of the two equations is zero or small,
and (ii) Zellner's estimator is better if the contemporaneous
correlation is high (also see Kunitomo (1977)). They also indicate
that the gain in efficiency from using Zellner's estimator is
especially high when the equation error correlation coefficient is
close to one, and the loss is small when the errors are mildly
correlated and the degrees of freedom exceed 12.

In this paper, we examine, under a squared error loss measure, the
risk of the seemingly unrelated regression pre-test estimator (SURPE),
which is the GLSE if a preliminary test indicates that the correlation
coefficient is significantly different from zero, and the LSE if we
accept the null hypothesis of no correlation. The motivation for this
research comes from Zellner's suggestion that it is possible to develop
a decision procedure for deciding whether to use the LSE or the GLSE.

In section 2, we present the statistical model and the various
estimators. Our main interest is to derive the risk function of the
SURPE with respect to the joint distribution of the test statistic
$r = s_{12}/\sqrt{s_{11}s_{22}}$ and $v = s_{12}/s_{22}$, where the
$s_{ij}$ (i,j = 1,2), which are defined later, are consistent
estimators of the variances and the covariances of the errors. The
small sample distribution of r as a function of the population
correlation coefficient $\phi$ is given in section 3. The marginal
distribution of r is obtained from the joint distribution of r and v.
In section 4, we derive the risk function of the SURPE and compare it
with those of the LSE and GLSE. Section 5 summarizes and discusses
the implications of the paper.

2.   Statistical  Model  and  Estimators 

Consider  the  following  two  sample  regression  model 


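$$
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix},
\quad\text{or}\quad y = X\alpha + e
\tag{2.1}
$$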


where $y_i$ is an (n x 1) vector of observations, $X_i$ is an (n x p)
matrix of fixed regressors of rank p, $\alpha_i$ is a (p x 1) unknown
location vector, and $e_i$ is an (n x 1) random error vector for
i = 1,2. For expository purposes we assume that
$X_1'X_2 = X_2'X_1 = 0_p$. We further assume that the equation errors
are distributed as multivariate normal random variables with zero
means and covariance matrix


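$$
\Sigma = E\left[\begin{pmatrix} e_1 \\ e_2 \end{pmatrix}
\begin{pmatrix} e_1' & e_2' \end{pmatrix}\right]
= E[ee']
= \begin{bmatrix} \sigma_{11}I_n & \sigma_{12}I_n \\
                  \sigma_{21}I_n & \sigma_{22}I_n \end{bmatrix}
\tag{2.2}
$$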


where $I_n$ is an identity matrix of dimension n. The LSE for this
model is

$$
\alpha^*(1) =
\begin{bmatrix} (X_1'X_1)^{-1}X_1'y_1 \\ (X_2'X_2)^{-1}X_2'y_2 \end{bmatrix}
\tag{2.3}
$$


The Zellner SUR estimator

$$
\alpha^*(2) = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y
\tag{2.4}
$$


is obtained by applying Aitken's GLSE to the whole system (2.1). The
estimator in (2.4) is not feasible since it depends on unknown
parameters of the $\Sigma$ matrix. Replacing $\Sigma$ by a consistent
estimator S produces Zellner's feasible GLSE, $\alpha^*(4)$. One
choice for the elements of

$$
S = \begin{bmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{bmatrix}
\quad\text{is}\quad
s_{ij} = \frac{1}{n}\,(y_i - X_i\alpha_i^*(1))'(y_j - X_j\alpha_j^*(1)),
\qquad i,j = 1,2.
$$
Now the feasible GLSE is given by

$$
\alpha^*(4) =
\begin{bmatrix}
(X_1'X_1)^{-1}X_1'y_1 - (s_{12}/s_{22})(X_1'X_1)^{-1}X_1'y_2 \\
(X_2'X_2)^{-1}X_2'y_2 - (s_{12}/s_{11})(X_2'X_2)^{-1}X_2'y_1
\end{bmatrix}
\tag{2.5}
$$


where we have used the assumption $X_1'X_2 = X_2'X_1 = 0_p$ and the
$s_{ij}$ are the elements of S. The estimates of the variances and
the covariances are obtained from the restricted residuals, that is,
the residuals obtained from regressing $y_i$ on $X_i$ (i = 1,2),
implicitly assuming $\phi = 0$.
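
The reduction of the system GLSE to an equation-by-equation correction
can be spelled out as follows. This is a sketch under the orthogonality
assumption $X_1'X_2 = 0_p$; the superscripted $\sigma^{ij}$, denoting
the elements of the inverse of the 2 x 2 error covariance matrix, are
notation introduced here for illustration and do not appear elsewhere
in the paper.

```latex
% Sketch: sigma^{ij} are the elements of the inverse of the 2x2 matrix
% [sigma_11 sigma_12; sigma_21 sigma_22] (notation assumed here).
\[
\begin{aligned}
X'\Sigma^{-1}X &=
\begin{bmatrix} \sigma^{11}X_1'X_1 & 0 \\ 0 & \sigma^{22}X_2'X_2 \end{bmatrix},
\qquad
X'\Sigma^{-1}y =
\begin{bmatrix} \sigma^{11}X_1'y_1 + \sigma^{12}X_1'y_2 \\
                \sigma^{12}X_2'y_1 + \sigma^{22}X_2'y_2 \end{bmatrix}, \\[4pt]
\frac{\sigma^{12}}{\sigma^{11}} &= -\frac{\sigma_{12}}{\sigma_{22}},
\qquad
\frac{\sigma^{12}}{\sigma^{22}} = -\frac{\sigma_{12}}{\sigma_{11}}, \\[4pt]
\alpha_1^*(2) &= (X_1'X_1)^{-1}X_1'y_1
  - \frac{\sigma_{12}}{\sigma_{22}}\,(X_1'X_1)^{-1}X_1'y_2 .
\end{aligned}
\]
```

Replacing $\sigma_{12}/\sigma_{22}$ by its sample counterpart
$s_{12}/s_{22}$ gives the first block of the feasible estimator; this
ratio is the quantity denoted v in section 3.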

The SUR pre-test estimator (SURPE) is based on the test statistic
$r = s_{12}/\sqrt{s_{11}s_{22}}$, which is used to test the null
hypothesis $H_0: \phi = 0$ that the population correlation coefficient
$\phi$ is zero, versus a one-sided alternative $H_a: \phi > 0$. We
reject the null hypothesis if r > c, where c is the critical value
chosen for the test. If we suspect a negative correlation then we
reject $H_0$ if r < -c. A two-sided alternative can also be set up and
this would of course have implications for the properties of the
implied pretest estimator. This test statistic is similar to the
locally best invariant test statistic given by Kariya (1981) and the
Lagrange multiplier statistic of Breusch and Pagan (1980) and Shiba
and Tsurumi (1988). The pretest estimator (Judge and Bock (1978)) is
defined as follows: if we accept $H_0$, the SURPE is the LSE, and
otherwise it is the GLSE. This means the SURPE is

$$
\alpha^*(3) = I_{[-1,c]}(r)\,\alpha^*(1) + I_{(c,+1]}(r)\,\alpha^*(4)
\tag{2.6}
$$

where $I_{(\cdot)}(\cdot)$ is a zero-one indicator function.
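
The three estimators of this section can be illustrated with a minimal
sketch for the simplest case p = 1, where each equation has a single
regressor and the two regressors are orthogonal. The function name
`surpe_first_equation`, the helper `dot`, the simulated data and the
critical value 0.45 are all assumptions made for this illustration;
only the first equation's coefficient is computed.

```python
import math
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def surpe_first_equation(x1, x2, y1, y2, c):
    """Return (lse, fgls, surpe, r) for the first equation, p = 1.

    Assumes x1 and x2 are orthogonal, as in the text.
    """
    n = len(y1)
    b1 = dot(x1, y1) / dot(x1, x1)          # LSE, equation 1
    b2 = dot(x2, y2) / dot(x2, x2)          # LSE, equation 2
    u1 = [y - b1 * x for x, y in zip(x1, y1)]  # restricted residuals
    u2 = [y - b2 * x for x, y in zip(x2, y2)]
    s11, s22, s12 = dot(u1, u1) / n, dot(u2, u2) / n, dot(u1, u2) / n
    r = s12 / math.sqrt(s11 * s22)          # pre-test statistic
    v = s12 / s22
    fgls = b1 - v * dot(x1, y2) / dot(x1, x1)  # first block of (2.5)
    surpe = fgls if r > c else b1           # pre-test rule (2.6)
    return b1, fgls, surpe, r

# Illustrative data: orthogonal +/-1 regressors, error correlation 0.6.
random.seed(1)
n = 20
x1 = [1.0, -1.0, 1.0, -1.0] * (n // 4)
x2 = [1.0, 1.0, -1.0, -1.0] * (n // 4)
phi = 0.6
e1 = [random.gauss(0, 1) for _ in range(n)]
e2 = [phi * a + math.sqrt(1 - phi ** 2) * random.gauss(0, 1) for a in e1]
y1 = [2.0 * x + e for x, e in zip(x1, e1)]
y2 = [-1.0 * x + e for x, e in zip(x2, e2)]
lse, fgls, surpe, r = surpe_first_equation(x1, x2, y1, y2, c=0.45)
```

With c = 1 the rule never rejects and the sketch returns the LSE; with
c = -1 it returns the feasible GLSE for any sample with r > -1.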

3.   The  Small  Sample  Distribution  of  r 

The distribution of the SURPE $\alpha^*(3)$, and hence its risk,
depends on the distribution of r. Therefore, in this section we derive
the small sample distribution of r. First, we find the joint
distribution of the test statistic r and v. It is well known that
$ns_{11} = x$, $ns_{22} = y$ and $ns_{12} = z$ are distributed
according to the Wishart distribution with covariance matrix
$\Sigma$ and degrees of freedom t = n-2p. The joint density of x, y
and z is given by

$$
W(\Sigma,t) = k\,(xy - z^2)^{(t-3)/2}
\exp\left[-\left(\frac{x}{\sigma_{11}}
 - \frac{2\phi z}{\sqrt{\sigma_{11}\sigma_{22}}}
 + \frac{y}{\sigma_{22}}\right)\Big/ 2(1-\phi^2)\right]
\tag{3.1}
$$

where $k = 1/[2^t\,|\Sigma|^{t/2}\sqrt{\pi}\,\Gamma(t/2)\,\Gamma((t-1)/2)]$.
In the evaluation we make a transformation from the variables x, y and
z to $r = z/\sqrt{xy}$, $v = z/y$ and $w = z$. The density in these
new variables, with Jacobian $2w^2/vr^3$, is

$$
f(r,v,w) = k\,\frac{2w^2}{vr^3}\left(\frac{w^2}{r^2} - w^2\right)^{(t-3)/2}
\exp\left\{-w\left(\frac{v}{\sigma_{11}r^2}
 - \frac{2\phi}{\sqrt{\sigma_{11}\sigma_{22}}}
 + \frac{1}{\sigma_{22}v}\right)\Big/ 2(1-\phi^2)\right\}
\tag{3.2}
$$

where $w, v \in R$ and $-1 < r < +1$.
Due to the nature of the transformation, the density in (3.2) is
defined  only  when  r,  v,  w  are  either  all  positive  or  all  negative. 
As  we  see  later,  for  our  purpose,  it  is  sufficient  to  consider  only 
positive  values  of  r.   Therefore,  from  now  on,  we  consider  f(r,v,w) 
only  when  r,  v,  w  are  all  positive  and  this  means  we  assume  a  positive 
critical  value. 
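
The Jacobian quoted with (3.2) can be checked directly by inverting the
transformation. The following worked step is a sketch of that check,
using only the definitions $r = z/\sqrt{xy}$, $v = z/y$ and $w = z$
given above.

```latex
\[
\begin{gathered}
x = \frac{wv}{r^{2}}, \qquad y = \frac{w}{v}, \qquad z = w,
\qquad xy - z^{2} = \frac{w^{2}}{r^{2}} - w^{2}, \\[6pt]
\frac{\partial(x,y,z)}{\partial(r,v,w)} =
\begin{vmatrix}
 -2wv/r^{3} & w/r^{2} & v/r^{2} \\
 0 & -w/v^{2} & 1/v \\
 0 & 0 & 1
\end{vmatrix}
= \frac{2w^{2}}{v\,r^{3}} .
\end{gathered}
\]
```

Because y = w/v must be positive, w and v share the same sign, and
$r = z/\sqrt{xy}$ takes the sign of w as well; this is the restriction
to all-positive or all-negative (r, v, w) stated above.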

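To obtain the joint density of r and v, we integrate out w by
using the gamma function:

$$
f(r,v) = \frac{2k\,(1-r^2)^{(t-3)/2}\,\Gamma(t)}
{\left[\left(\dfrac{v}{r^2\sigma_{11}}
 - \dfrac{2\phi}{\sqrt{\sigma_{11}\sigma_{22}}}
 + \dfrac{1}{v\sigma_{22}}\right)\Big/ 2(1-\phi^2)\right]^{t} v\,r^{t}}
\tag{3.3}
$$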
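
The integration over w is a standard gamma integral. As a sketch of the
intermediate step, write A for the coefficient of w in the exponent of
(3.2); the symbol A is introduced here only for this illustration.

```latex
\[
\begin{aligned}
\frac{2w^{2}}{vr^{3}}\Bigl(\frac{w^{2}}{r^{2}} - w^{2}\Bigr)^{(t-3)/2}
 &= \frac{2\,(1-r^{2})^{(t-3)/2}}{v\,r^{t}}\; w^{t-1}, \\[4pt]
\int_{0}^{\infty} w^{t-1} e^{-Aw}\,dw &= \frac{\Gamma(t)}{A^{t}},
\qquad
A = \Bigl(\frac{v}{r^{2}\sigma_{11}}
      - \frac{2\phi}{\sqrt{\sigma_{11}\sigma_{22}}}
      + \frac{1}{v\sigma_{22}}\Bigr)\Big/ 2(1-\phi^{2}) .
\end{aligned}
\]
```

Multiplying the two pieces by k gives the joint density of r and v for
positive r, v and w.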
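If we define

$$
g = \frac{1}{2(1-\phi^2)\sigma_{11}} > 0, \qquad
h = \frac{-\phi}{(1-\phi^2)\sqrt{\sigma_{11}\sigma_{22}}} \in R, \qquad
q = \frac{1}{2(1-\phi^2)\sigma_{22}} > 0,
\tag{3.4}
$$

the density in (3.3) may be written compactly as

$$
\begin{aligned}
f(r,v) &= \frac{2k\,(1-r^2)^{(t-3)/2}\,\Gamma(t)\,v^{t-1}}
               {r^{t}\,\bigl((gv^2/r^2) + hv + q\bigr)^{t}} \\
       &= \frac{2k\,(1-r^2)^{(t-3)/2}\,\Gamma(t)\,v^{t-1}}
               {r^{t}g^{t}\,\bigl((v^2/r^2) + (hv/g) + q/g\bigr)^{t}} \\
       &= \frac{2k\,(1-r^2)^{(t-3)/2}\,\Gamma(t)\,v^{t-1}}
               {r^{t}g^{t}\,\bigl[\bigl((v/r) + hr/2g\bigr)^2 + (q/g) - h^2r^2/4g^2\bigr]^{t}}
\end{aligned}
\tag{3.5}
$$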
This completes the derivation of the joint density of r and v.

To obtain the marginal distribution of r we define
$m(r) = \bigl((q/g) - h^2r^2/4g^2\bigr)^{1/2}$ and make a change of
variable in v, $x = v + hr^2/2g$ and r = r. This gives

$$
\begin{aligned}
f(r,x) &= \frac{2k\,(1-r^2)^{(t-3)/2}\,\Gamma(t)\,(x - hr^2/2g)^{t-1}\,r^{t}}
               {g^{t}\,\bigl(x^2 + r^2m(r)^2\bigr)^{t}} \\
       &= \frac{2k\,\Gamma(t)\,(1-r^2)^{(t-3)/2}\,r^{t}}{g^{t}}
          \sum_{j=0}^{t-1}\binom{t-1}{j}
          \Bigl(-\frac{hr^2}{2g}\Bigr)^{t-1-j}
          \frac{x^{j}}{\bigl(x^2 + r^2m(r)^2\bigr)^{t}},
\end{aligned}
\qquad x > \frac{hr^2}{2g}
\tag{3.6}
$$


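Next we substitute $x = r\,m(r)\tan\theta$ and obtain

$$
f(r,\theta) = \frac{2k\,\Gamma(t)\,(1-r^2)^{(t-3)/2}}{g^{t}}
\sum_{j=0}^{t-1}\binom{t-1}{j}
\Bigl(-\frac{hr^2}{2g}\Bigr)^{t-1-j}
\frac{\sin^{j}\theta\,\cos^{w-j}\theta}{m(r)^{w-j+1}\,r^{w/2-j}}
\tag{3.7}
$$

where w now denotes 2t-2 and $\arctan(hr/2gm(r)) < \theta < \pi/2$.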

To integrate out $\theta$, we use successive integration by parts.
This method depends heavily on the parity of j, since each integration
by parts reduces the power of the sine by two. Hence we distinguish two
cases: i) j is even, and ii) j is odd. The value of the integral for
even j is given by

$$
\begin{aligned}
I_e &= \int_{\theta^*}^{\pi/2} (\sin\theta)^{j}(\cos\theta)^{w-j}\,d\theta \\
    &= \sum_{i=1}^{j/2} \frac{(j-1)!!}{(j-2i+1)!!}\,
       \frac{(w-j-1)!!}{(w-j-1+2i)!!}\,
       \sin(\theta^*)^{j+1-2i}\cos(\theta^*)^{w-j-1+2i} \\
    &\quad + \frac{(j-1)!!\,(w-j-1)!!}{(w-1)!!}
       \int_{\theta^*}^{\pi/2} (\cos\theta)^{w}\,d\theta
\end{aligned}
\tag{3.8}
$$
where $\theta^* = \arctan\bigl(hr/2gm(r)\bigr)$ and !! means double
factorial. The integral in (3.8) can be evaluated by using the value
given in Gradshteyn and Ryzhik (1980):

$$
\int_{\theta^*}^{\pi/2} (\cos\theta)^{w}\,d\theta
= \frac{1}{2^{w}}\binom{w}{w/2}\Bigl(\frac{\pi}{2} - \theta^*\Bigr)
- \frac{1}{2^{w-1}}\sum_{k=0}^{t-2}\binom{w}{k}
  \frac{\sin\{(w-2k)\theta^*\}}{w-2k}
\tag{3.9}
$$

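When j is odd, the odd terms of the summation indexed by j in (3.7)
can be integrated using

$$
\begin{aligned}
I_o &= \int_{\theta^*}^{\pi/2} (\sin\theta)^{j}(\cos\theta)^{w-j}\,d\theta \\
    &= \sum_{i=1}^{(j+1)/2} \frac{(j-1)!!}{(j-2i+1)!!}\,
       \frac{(w-j-1)!!}{(w-j-1+2i)!!}\,
       \sin(\theta^*)^{j+1-2i}\cos(\theta^*)^{w-j-1+2i}
\end{aligned}
\tag{3.10}
$$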
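
The reduction formulas for the two parities can be checked numerically.
The sketch below implements the closed forms for the integral of
$\sin^j\theta\cos^{w-j}\theta$ over $[\theta^*, \pi/2]$, with every
boundary term entering with a positive sign, as follows from integrating
by parts between $\theta^*$ and $\pi/2$, and compares them with a
composite Simpson rule. The function names `dfact`, `cos_power_tail`,
`trig_integral` and `simpson` are assumptions made for this
illustration.

```python
import math

def dfact(n):
    """Double factorial, with the convention (-1)!! = 0!! = 1."""
    out = 1
    while n > 1:
        out *= n
        n -= 2
    return out

def cos_power_tail(w, th):
    """Integral of cos^w over [th, pi/2] for even w (Gradshteyn-Ryzhik form)."""
    s = sum(math.comb(w, k) * math.sin((w - 2 * k) * th) / (w - 2 * k)
            for k in range(w // 2))
    return math.comb(w, w // 2) * (math.pi / 2 - th) / 2 ** w - s / 2 ** (w - 1)

def trig_integral(j, w, th):
    """Integral of sin^j cos^(w-j) over [th, pi/2], w even, 0 <= j <= w."""
    upper = j // 2 if j % 2 == 0 else (j + 1) // 2
    total = 0.0
    for i in range(1, upper + 1):
        coef = (dfact(j - 1) / dfact(j - 2 * i + 1)
                * dfact(w - j - 1) / dfact(w - j - 1 + 2 * i))
        total += (coef * math.sin(th) ** (j + 1 - 2 * i)
                  * math.cos(th) ** (w - j - 1 + 2 * i))
    if j % 2 == 0:  # even j leaves a pure cosine-power integral
        total += (dfact(j - 1) * dfact(w - j - 1) / dfact(w - 1)
                  * cos_power_tail(w, th))
    return total

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3
```

For example, `trig_integral(2, 4, 0.0)` evaluates the integral of
$\sin^2\theta\cos^2\theta$ over $[0, \pi/2]$, which equals $\pi/16$.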

Finally, using $I_e$ or $I_o$ depending on whether j is even or odd,
$\theta$ in (3.7) can be integrated out to compute the marginal
distribution of the test statistic. This is given by

$$
f(r) = \frac{2\,(1-r^2)^{(t-3)/2}\,\Gamma(t)\,(1-\phi^2)^{t/2}}
            {\sqrt{\pi}\,\Gamma(t/2)\,\Gamma((t-1)/2)}
\sum_{j=0}^{t-1}\binom{t-1}{j}\,
\frac{(\phi r)^{t-1-j}\,(I_e, I_o, j)}{(1-\phi^2r^2)^{t-1/2-j/2}}
\tag{3.11}
$$

where $(I_e, I_o, j)$ means that we pick either $I_e$ or $I_o$
depending on whether j is even or odd.
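
The marginal density can be evaluated numerically along the following
lines. This is a sketch, not the authors' program: the integrals
$I_e$ and $I_o$ are approximated by a Simpson rule instead of the
closed forms, and the lower limit is written as
$\theta^* = -\arcsin(\phi r)$, a simplification obtained here by
substituting (3.4) into $\arctan(hr/2gm(r))$. The names
`trig_integral_numeric` and `density_r` are assumptions for this
illustration.

```python
import math

def trig_integral_numeric(j, w, th, m=2000):
    """Simpson approximation of the integral of sin^j cos^(w-j) on [th, pi/2]."""
    f = lambda a: math.sin(a) ** j * math.cos(a) ** (w - j)
    b = math.pi / 2
    h = (b - th) / m
    s = f(th) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(th + i * h)
    return s * h / 3

def density_r(r, t, phi, m=2000):
    """Marginal density of r from (3.11), for 0 < r < 1 and integer t >= 3."""
    w = 2 * t - 2
    th = -math.asin(phi * r)  # theta*, simplified form (assumption noted above)
    const = (2 * (1 - r * r) ** ((t - 3) / 2) * math.gamma(t)
             * (1 - phi * phi) ** (t / 2)
             / (math.sqrt(math.pi) * math.gamma(t / 2) * math.gamma((t - 1) / 2)))
    total = 0.0
    for j in range(t):
        total += (math.comb(t - 1, j) * (phi * r) ** (t - 1 - j)
                  * trig_integral_numeric(j, w, th, m)
                  / (1 - phi * phi * r * r) ** (t - 0.5 - j / 2))
    return const * total
```

When $\phi = 0$ only the j = t-1 term survives and the expression
reduces to the familiar null density of a sample correlation
coefficient,
$\Gamma(t/2)(1-r^2)^{(t-3)/2}/[\sqrt{\pi}\,\Gamma((t-1)/2)]$, which
gives a convenient check on any implementation.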
In Figures 1 and 2, this distribution is plotted as a function of
t = n-2p and $\phi$. In Figure 1, where $\phi = 0$, the distribution
is symmetric for t = 10, 15. The distribution for the larger t has
more probability mass around zero, but goes to zero faster on either
side as r differs from zero. In Figure 2, we show, for t = 15, the
same distribution with $\phi = .2$ and $\phi = .4$. Under this
scenario, as $\phi$ gets larger there is more probability to the
right. For example, $P(r>0 \mid \phi=.2) = .72$, whereas
$P(r>0 \mid \phi=.4) = .88$.

4.   The  Risk  of  the  Pre-test  Estimator  (SURPE) 

Since the derivation is symmetric and the calculations for the
second sample are exactly similar, we can reduce the dimensionality
of the coefficient vectors by two without affecting the results.
Therefore, henceforth $\alpha^*(1)$, $\alpha^*(3)$ and $\alpha^*(4)$
are (p x 1) vectors of estimators of the coefficients of the first
sample only. Under squared error loss the risk of the SURPE is given by

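$$
\begin{aligned}
\rho(\alpha^*(3),\alpha_1)
&= \operatorname{tr}E\,\bigl\| I_{[-1,c]}(r)\,\alpha^*(1)
   + I_{(c,+1]}(r)\,\alpha^*(4) - \alpha_1 \bigr\|^2 \\
&= \operatorname{tr}E\,\bigl\|
   \bigl[ I_{[-1,c]}(r)(X_1'X_1)^{-1}X_1'y_1 - I_{[-1,c]}(r)\,\alpha_1 \bigr] \\
&\qquad + \bigl[ I_{(c,+1]}(r)\bigl\{(X_1'X_1)^{-1}X_1'y_1
   - v\,(X_1'X_1)^{-1}X_1'y_2\bigr\}
   - I_{(c,+1]}(r)\,\alpha_1 \bigr] \bigr\|^2
\end{aligned}
\tag{4.1}
$$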

FIG. 1.  THE SMALL SAMPLE DISTRIBUTION OF R (t=10, 15: $\phi$=0)

FIG. 2.  THE SMALL SAMPLE DISTRIBUTION OF R (t=15: $\phi$=0.2, 0.4)

Using $(X_1'X_1)^{-1}X_1'y_1 = \alpha_1 + (X_1'X_1)^{-1}X_1'e_1$ and
$(X_1'X_1)^{-1}X_1'y_2 = (X_1'X_1)^{-1}X_1'e_2$ we have

$$
\begin{aligned}
\rho(\alpha^*(3),\alpha_1)
&= \operatorname{tr}E\,\bigl\| \bigl[ I_{[-1,c]}(r)(X_1'X_1)^{-1}X_1'e_1
   + I_{(c,+1]}(r)(X_1'X_1)^{-1}X_1'e_1 \\
&\qquad - I_{(c,+1]}(r)\,v\,(X_1'X_1)^{-1}X_1'e_2 \bigr] \bigr\|^2 \\
&= \operatorname{tr}E\,\bigl\| (X_1'X_1)^{-1}X_1'e_1
   - I_{(c,+1]}(r)\,v\,(X_1'X_1)^{-1}X_1'e_2 \bigr\|^2
\end{aligned}
\tag{4.2}
$$


where we use the fact that $I_{[-1,c]}(r) + I_{(c,+1]}(r) = 1$, since
$r \in [-1,1]$. Also, because the domains of the indicator functions
are disjoint, $I_{[-1,c]}(r)\,I_{(c,+1]}(r) = 0$ and we obtain

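$$
\begin{aligned}
\rho(\alpha^*(3),\alpha_1) &= \sigma_{11}\operatorname{tr}(X_1'X_1)^{-1} \\
&\quad - 2\operatorname{tr}E\bigl\{ I_{(c,+1]}(r)\,v\,
   (X_1'X_1)^{-1}X_1'e_1e_2'X_1(X_1'X_1)^{-1} \bigr\} \\
&\quad + \operatorname{tr}E\bigl\{ I_{(c,+1]}(r)\,v^2\,
   (X_1'X_1)^{-1}X_1'e_2e_2'X_1(X_1'X_1)^{-1} \bigr\}
\end{aligned}
\tag{4.3}
$$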

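Using the independence of the vectors ($\alpha^*(1)$,
$(X_1'X_1)^{-1}X_1'y_1$, $(X_1'X_1)^{-1}X_1'y_2$) from the scale
parameter estimates ($s_{11}$, $s_{12}$, $s_{22}$) yields

$$
\begin{aligned}
\rho(\alpha^*(3),\alpha_1) &= \sigma_{11}\operatorname{tr}(X_1'X_1)^{-1} \\
&\quad - 2E\{I_{(c,+1]}(r)\,v\}\operatorname{tr}E\bigl\{
   (X_1'X_1)^{-1}X_1'e_1e_2'X_1(X_1'X_1)^{-1}\bigr\} \\
&\quad + E\{I_{(c,+1]}(r)\,v^2\}\operatorname{tr}E\bigl\{
   (X_1'X_1)^{-1}X_1'e_2e_2'X_1(X_1'X_1)^{-1}\bigr\} \\
&= \sigma_{11}\operatorname{tr}(X_1'X_1)^{-1}
   - 2\sigma_{12}E\{I_{(c,+1]}(r)\,v\}\operatorname{tr}(X_1'X_1)^{-1} \\
&\quad + \sigma_{22}\operatorname{tr}(X_1'X_1)^{-1}E\{I_{(c,+1]}(r)\,v^2\}
\end{aligned}
\tag{4.4}
$$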
In order to compare the risks of the SURPE, Zellner's GLSE and the
LSE, all risk evaluations are made with respect to the LSE risk,
$\sigma_{11}\operatorname{tr}(X_1'X_1)^{-1}$. Therefore, the relative
risk is

$$
\frac{\rho(\alpha^*(3),\alpha_1)}{\rho(\alpha^*(1),\alpha_1)}
= 1 - 2E\{I_{(c,+1]}(r)\,v\}\,\frac{\sigma_{12}}{\sigma_{11}}
    + E\{I_{(c,+1]}(r)\,v^2\}\,\frac{\sigma_{22}}{\sigma_{11}}
\tag{4.5}
$$

Here we should note that the r in the argument of the indicator
function in (4.5) is positive unless we choose a negative value of c.
That is why, in section 3, the joint distribution f(r,v,w) is
considered only for positive values of r, v and w [see equation (3.2)].

The relative risk values of the SURPE with respect to that of the LSE
are given as a function of the population correlation coefficient
$\phi$ and the critical value of the test c, in Table 1, for t = 10,
15, and 20 respectively, when $\sigma_{11} = \sigma_{22} = 1$. These
values are obtained by calculating the expectations in (4.5) with
respect to the joint distribution of r and v that is derived in
Section 3.
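
The expectations in (4.5) can also be approximated by simulation, which
offers an independent check on a numerical integration. The sketch
below draws the Wishart variates (x, y, z) as sums of squares and cross
products of t bivariate normal pairs with unit variances and
correlation $\phi$, and averages the indicator-weighted moments of v.
The function name `relative_risk`, the replication count and the seed
are assumptions made for this illustration; it is not the procedure
used to produce Table 1.

```python
import math
import random

def relative_risk(phi, c, t, reps=20000, seed=0):
    """Monte Carlo estimate of (4.5) with sigma11 = sigma22 = 1, sigma12 = phi."""
    rng = random.Random(seed)
    root = math.sqrt(1.0 - phi * phi)
    sum_v = sum_v2 = 0.0
    for _ in range(reps):
        x = y = z = 0.0
        for _ in range(t):                  # t degrees of freedom
            e1 = rng.gauss(0.0, 1.0)
            e2 = phi * e1 + root * rng.gauss(0.0, 1.0)
            x += e1 * e1
            y += e2 * e2
            z += e1 * e2
        r = z / math.sqrt(x * y)            # pre-test statistic
        v = z / y                           # ratio multiplying the correction
        if r > c:                           # indicator on (c, +1]
            sum_v += v
            sum_v2 += v * v
    return 1.0 - 2.0 * phi * sum_v / reps + sum_v2 / reps
```

Two limiting cases are easy to verify by hand. With c = 1 the test
never rejects and the relative risk is exactly one. With c = -1 and
$\phi = 0$ the GLSE is always used and the relative risk is
$1 + E(1/y) = 1 + 1/(t-2)$, since y is chi-square with t degrees of
freedom and z given y is normal with variance y.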

From the tabled values of the relative risk of the SURPE, which is a
function of $\phi$ and the critical value c used in the preliminary
testing, we notice that over the range of the ($\phi$,c) parameter
space, the relative risks of the pretest estimators cross. As larger
and larger critical values are used, the LSE is used more frequently
and this causes the relative risk of the SURPE to decrease for $\phi$
close to zero, and to increase for $\phi$ close to one. The effect of
degrees of freedom on these results is minimal.

TABLE 1

Relative risk values of SURPE as a function of the population
correlation coefficient $\phi$ and the critical value c

t = 10
  c     $\phi$=.1   $\phi$=.3   $\phi$=.5   $\phi$=.7   $\phi$=.9
 .9      1.0004      1.0009      1.0002      0.9775      0.5551
 .8      1.0040      1.0072      0.9967      0.8753      0.3030
 .7      1.0133      1.0180      0.9803      0.7652      0.2413
 .6      1.0273      1.0273      0.9517      0.6837      0.2247
 .5      1.0425      1.0303      0.9187      0.6332      0.2196
 .4      1.0552      1.0263      0.8887      0.6050      0.2179
 .3      1.0630      1.0178      0.8660      0.5907      0.2174
 .0      1.0648      0.9997      0.8426      0.5815      0.2172

t = 15
  c     $\phi$=.1   $\phi$=.3   $\phi$=.5   $\phi$=.7   $\phi$=.9
 .9      1.0000      1.0000      1.0000      0.9924      0.5623
 .8      1.0001      1.0005      0.9870      0.8163      0.2563
 .7      1.0017      1.0041      0.9807      0.7554      0.2129
 .6      1.0064      1.0085      0.9436      0.6459      0.2128
 .5      1.0146      1.0085      0.8967      0.5880      0.2048
 .4      1.0240      1.0011      0.8553      0.5626      0.2047
 .3      1.0310      0.9885      0.8271      0.5530      0.2046
 .0      1.0307      0.9651      0.8049      0.5491      0.2046

t = 20
  c     $\phi$=.1   $\phi$=.3   $\phi$=.5   $\phi$=.7   $\phi$=.9
 .9      1.0000      1.0000      1.0000      0.9972      0.5665
 .8      1.0000      1.0002      0.9987      0.9192      0.2348
 .7      1.0004      1.0015      0.9848      0.7528      0.2200
 .6      1.0022      1.0040      0.9450      0.6266      0.2195
 .5      1.0070      1.0031      0.8979      0.5675      0.2135
 .4      1.0143      0.9942      0.8413      0.5465      0.2090
 .3      1.0207      0.9790      0.8107      0.5402      0.2088
 .0      1.0212      0.9524      0.7907      0.5376      0.2086

FIG. 3.  RISK VALUES OF SURPE ESTIMATORS (t=10)

The critical values of the SURPE for significance levels .05 and
.10 are respectively .60 and .45. The relative risks of the LSE and
Zellner's GLSE for t = 10 are presented in Figure 3. The risk values
of Zellner's estimator are taken from Zellner's paper (1963, p. 983).
We observe that the relative risk of the SURPE with c = .60 starts
below that of c = .45, crosses the latter around $\phi = .3$, and
remains above for all $\phi > .3$. This means that throughout the
(c,$\phi$) parameter space, no one SURPE is risk superior to the other.
The SURPE with c = .60 is risk superior to the SURPE with c = .45 for
$\phi$ close to zero. In turn it is risk inferior once $\phi$ exceeds
.3. This relationship between SURPEs with different critical values
holds true throughout. In general, as can be observed from Table 1,
the SURPE with a larger critical value has a smaller sampling
variability when $\phi$ is small, but then performs worse after its
risk crosses that of the SURPE with a smaller critical value.

The relative risk function of Zellner's GLSE is also presented in
Figure 3. Its risk is highest for small $\phi$, and then crosses the
risks of the LSE, SURPE (c=.6) and finally SURPE (c=.45) as $\phi$
gets larger. Therefore, under squared error loss, none of the
estimators in Figure 3 dominates. However, it is interesting to note
that there is a range of $\phi$ where the SURPE is better than both
the LSE and the GLSE. This is not the case in regression coefficient
pretesting. A possible reason for this might be the fact that
$0 \le \phi \le 1$ prevents the pretest from making any disastrous
type I and type II errors. The SURPE with 0 < c < 1 at $\phi = 0$
starts with a risk in between that of the LSE and the GLSE. It ends
with a risk in between these two estimators when $\phi = 1$. One
can also see that the SURPE has a substantial risk gain over the LSE
for large $\phi$, and the risk loss is modest when $\phi$ is close to
zero. When the critical value c takes on extreme values, the risk of
the SURPE approaches the risk of the LSE or the risk of the GLSE
depending on whether c tends to 1 or to -1. Similar comparisons can be
made for the same estimators in Figure 4 with t = 20, where the
critical values .5 and .35 correspond to significance levels .05 and
.1 respectively. As t increases, Zellner's GLSE becomes more
efficient, and in fact approaches asymptotic efficiency levels.

5.   Summary  and  Conclusions 

We have made risk comparisons between the SURPE, LSE and Zellner's
GLSE in the two sample seemingly unrelated regression model and found
that no one estimator is uniformly superior. However, we can now
determine the risk gains that accrue when the pre-test estimator is
used to take advantage of the risk superiority of the LSE when $\phi$
is close to zero, and of the GLSE when $\phi$ is close to 1.
Alternatively, we can determine the risk consequences of always using
the pre-test rule. Finally, we examined the distribution of the test
statistic r, evaluated some probabilities by numerical methods and
found that the distribution is symmetric around zero when $\phi = 0$,
but skewed to the left for $\phi > 0$.

The applied statistician can gain insight into the nature of the
correlation of disturbances of the underlying model by conducting a
preliminary test. Consequently, in many situations, the risk
advantages of the SUR pre-test estimator over the LSE and the GLSE can
be exploited. For example, in a somewhat different context, Stanek
(1988) considers an experimental design which permits a variety of
hypotheses to be tested, some of which use SUR estimation to reduce
variances. The SURPE procedure could be used to determine if SUR
estimation is justified.

FIG. 4.  RISK VALUES OF SURPE ESTIMATORS (t=20)